Developing F. Smarandache's theme of paradoxes in mathematics, we state that, firstly, if in measurement (natural science) experiments the best solutions are found by the methods of modern data analysis theory, then difficulties with the interpretation of the computation results are liable to occur; secondly, these difficulties cannot be overcome without a modification of data analysis theory that consists in translating it from Aristotelian “binary logic” into the more progressive “fuzzy logic”.
Key words: data analysis, revealing outliers, confidence interval, fuzzy logic. 


1 Introduction 


As is generally known from the history of science, a scientific theory may undergo a crisis in the process of its development, when it breaks up into a set of fragmentary theories that are weakly coordinated with each other and, as a whole, form a collection of various non-integrated conceptions. For instance, as we assume, F. Smarandache's mathematical notions and questions help us to understand quite well that the stable equilibrium observed in mathematics at the present time is no more than a fantasy. Thus, it agrees exactly with F. Smarandache's views that finding and investigating paradoxes in mathematics is a very effective way of approaching the truth, and so at present each scientific study continuing F. Smarandache's theme should be considered highly topical.

Let us assume that computative paradoxes in mathematics are mainly computation results which are obtained by mathematical methods and which contradict some mathematical statements. The main goal of this paper is to demonstrate that the mentioned crisis, demanding practical action instead of debate, occurs in modern data analysis, which formally has its own developed mathematical theory but is not capable of coping adequately with a large number of practical problems of the quantitative processing of measurement results.

Another goal of this paper is to equip mathematicians and software designers working in the data analysis field with a set of examples demonstrating that, if one uses standard computer programmes and/or time-tested methods of modern data analysis theory to analyse data arrays, then a set of paradoxical computation results may be obtained.


2 Approximative problems of data analysis


2.1 The main problems of regression analysis theory and standard solution methods


As is generally known, for an experimentally found dependence $\{y_n, x_n\}$ ($n = 1, 2, \dots, N$) and a given approximative function $F(A, x)$, the main problems of regression analysis theory in measurement (natural science) experiments are finding the estimates $A'$ and $y'$ and the variances $\delta A'$ and $\delta(y - y')$, where $A'$ is an estimate of the vector parameter $A$ of the function $F(A, x)$ and $\{y_n'\} = \{F(A', x_n)\}$. In particular, if $F(A, x) = \sum_{l=1}^{L} a_l h_l(x)$ ($F(A, x)$ is a linear model), where $h_l(x)$ are some functions of $x$, then in the accepted regression analysis theory the standard solution of the discussed problems has the form


$A' = (H^T H)^{-1} H^T Y, \quad (\delta A')^2 = s^2 \,\mathrm{diag}\,(H^T H)^{-1},$ (1)

$\delta_p(y - y') = \pm t_p\, s \sqrt{1 + H_i^T (H^T H)^{-1} H_i},$


where $H$ is an $N \times L$ matrix with $n$-th row $(h_1(x_n), h_2(x_n), \dots, h_L(x_n))$; $H^T$ is the transposed matrix $H$; $Y = \{y_n\}$; $s^2 = \sum_n (y_n - y_n')^2 / (N - L)$; $H_i = (h_1(x_i), h_2(x_i), \dots, h_L(x_i))^T$; the value of $t_p$ is taken from the table of Student's $t$-distribution and generally depends on the assigned significance level $p$ and on the value of $N - L$ (the number of degrees of freedom); at the assigned significance level $p$, the notation $\delta_p(y - y')$ means the confidence interval for possible deviations of the experimental values of $y$ from the computed values $y' = F(A', x)$.
According to the Gauss-Markov theorem, for the classical data analysis model


$y_n = F(A, x_n) + e_n$ (2)


the solution (1) is the best one (it gives the minimum value of $s$) if the following conditions are fulfilled:

all values of $\{x_n\}$ are non-random, the mathematical expectation of the random value $\{e_n\}$ is equal to zero, and the random values $\{e_n\}$ are uncorrelated and have the same variance $\sigma^2$.
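The standard solution (1) can be sketched for the simplest linear model $F(A, x) = a_0 + a_1 x$, that is $h_1(x) = 1$, $h_2(x) = x$. The sketch below is only an illustration: it assumes the usual reading of (1) in which $s^2$ is the residual sum of squares divided by $N - L$, the function name `fit_line` is hypothetical, the data are invented, and the Student value `t_p` has to be supplied by the user from a table.

```python
import math

def fit_line(x, y, t_p):
    """Standard solution (1) for F(A, x) = a0 + a1*x (h_1 = 1, h_2 = x)."""
    n, L = len(x), 2
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(u * v for u, v in zip(x, y))
    det = n * sxx - sx * sx                      # determinant of H^T H
    # Elements of (H^T H)^{-1} in the 2 x 2 case
    c00, c01, c11 = sxx / det, -sx / det, n / det
    a0 = c00 * sy + c01 * sxy                    # A' = (H^T H)^{-1} H^T Y
    a1 = c01 * sy + c11 * sxy
    resid = [v - (a0 + a1 * u) for u, v in zip(x, y)]
    s = math.sqrt(sum(r * r for r in resid) / (n - L))
    d_a = (s * math.sqrt(c00), s * math.sqrt(c11))   # delta A'

    def half_width(xi):
        # t_p * s * sqrt(1 + H_i^T (H^T H)^{-1} H_i) with H_i = (1, xi)
        q = c00 + 2.0 * c01 * xi + c11 * xi * xi
        return t_p * s * math.sqrt(1.0 + q)

    return (a0, a1), d_a, s, half_width

# Invented data; t_p = 2.353 is the tabulated Student value for 3 degrees
# of freedom and a two-sided level of 0.9 (supplied by hand, not computed).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]
(a0, a1), d_a, s, half_width = fit_line(x, y, t_p=2.353)
print(a0, a1, d_a, half_width(2.0))
```

Comparing `half_width(x_n)` with the residuals $|y_n - y_n'|$ reading by reading is the kind of check that example 1 performs on D.I.Mendeleev data array.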


Example 1. Table 1 presents an experimental data array obtained by the Russian chemist D.I.Mendeleev in 1881, when he investigated the dependence of the solubility (y, relative units) of sodium nitrate (NaNO3) on the water temperature (x, °C).


D.I.Mendeleev data array 


Xn 
By analysing the data array $\{y_n, x_n\}$ presented in table 1, Y.V.Linnik states that these data, as was noted by D.I.Mendeleev, are well fitted by the linear model $y' = 67.5 + 0.871x$ ($\delta A' = (0.5; 0.2)$), although the agreement between the experimental values of $y$ and those computed from the linear model becomes slightly worse at the beginning and end of the investigated temperature region (see the values $\{y_n - y_n'\}$ adduced in table 1). We add that for the discussed data array Y.V.Linnik computes the confidence interval $\delta_p(y - y')$ from (1) at the significance level $p = 0.9$:


$\delta_{0.9}(y - y') = \pm 0.593 \sqrt{1 + (x - 26)^2 / 4511}.$ (3)


Figure 1. The plots of the confidence interval of the deviation of y from y' (heavy lines) and the residuals y - y' (circles) for D.I.Mendeleev data array.


In figure 1 we show the plots of $\delta_{0.9}(y - y')$ against $x$ by heavy lines and the residuals $\{y_n - y_n', x_n\}$ by circles. Since the plot of $\{y_n - y_n', x_n\}$ steps over the heavy lines in figure 1, a computative difficulty is revealed:

the standard way (1), used by Y.V.Linnik for determining the confidence interval of the deviations of y from y', is inconsistent with the discussed experimental data array.

It follows from the results presented in table 1 and/or figure 1 that, if one assumes $\delta(y - y') = \max_n |y_n - y_n'| = 1.73$, then the broken connection of the confidence interval $\delta(y - y')$ with D.I.Mendeleev data array is restored. But the values of $\delta A'$, calculated by Y.V.Linnik from (1), disagree with the values $\delta(y - y') > 1.73$, and, consequently,

the standard values of $\delta A'$ are inconsistent with D.I.Mendeleev data array as well.


2.2 Alternative methods of regression analysis theory 


P.Huber noted that, as a rule, 5-10% of all observations in the majority of analysed experimental arrays are anomalous or, in other words, the conditions of the Gauss-Markov theorem adduced above are not fulfilled. Consequently, in practice, instead of the standard solution (1), found by the “least squares (LS) method”, alternative methods developed in the framework of the accepted regression analysis theory should be used. In particular, if the data array $\{y_n, x_n\}$ contains a set of outliers, then to find the best solution of the discussed problem it is necessary either to remove all outliers from the analysed data array (strategy 1) or to compute the values of $A'$ on the initial data array by means of robust M-estimators (strategy 2). For revealing outliers in the data array P.J.Rousseeuw and A.M.Leroy suggest using one of two combined statistical procedures, in which the parameter estimates minimising the median of the array $\{(y_n - y_n')^2\}$ (the first procedure) or the sum of the $K$ first elements of the same array (the second procedure) are considered the best ones. If $F(A, x)$ is a linear function (see above), then the robust M-estimates $A'$ are obtained by solving one of two minimisation problems


$S_\varphi(A) = \sum_{n=1}^{N} \varphi(y_n - y_n') \Rightarrow \min \quad \text{or} \quad \partial S_\varphi / \partial a_l = \sum_{n=1}^{N} \psi(y_n - y_n')\, h_l(x_n) = 0,$ (4)


where the function $\varphi(r)$ is symmetric about the ordinate axis and continuously differentiable, with a minimum at zero and $\varphi(0) = 0$; $\psi(r)$ is the derivative of $\varphi(r)$ with respect to $r$.
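One common way of solving the second problem of (4) is iteratively reweighted least squares. The sketch below is an illustration and not the procedure used for the computations reported in this paper. It assumes the straight-line model $a_0 + a_1 x$, the Andrews choice $\psi(r) = \sin(r/d)$ for $|r| \le d\pi$ and $\psi(r) = 0$ otherwise, an ordinary LS fit as the starting point, and invented data; all function names are hypothetical.

```python
import math

def andrews_weight(r, d):
    """IRLS weight psi(r)/r for psi(r) = sin(r/d) if |r| <= d*pi, else 0."""
    if abs(r) > d * math.pi:
        return 0.0
    u = r / d
    return 1.0 / d if abs(u) < 1e-12 else math.sin(u) / r

def weighted_line(x, y, w):
    """Weighted least squares estimates (a0, a1) for y = a0 + a1*x."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx * swx
    return (swxx * swy - swx * swxy) / det, (sw * swxy - swx * swy) / det

def andrews_fit(x, y, d, n_iter=50):
    """Solve sum psi(y_n - y_n') h_l(x_n) = 0 by repeated reweighting."""
    a0, a1 = weighted_line(x, y, [1.0] * len(x))        # LS starting point
    for _ in range(n_iter):
        w = [andrews_weight(yi - a0 - a1 * xi, d) for xi, yi in zip(x, y)]
        a0, a1 = weighted_line(x, y, w)
    return a0, a1

# Invented data: the exact line y = 2 + 0.5 x with one gross outlier.
x = [float(i) for i in range(10)]
y = [2.0 + 0.5 * xi for xi in x]
y[9] += 20.0
print(weighted_line(x, y, [1.0] * 10))   # LS estimates, pulled by the outlier
print(andrews_fit(x, y, d=1.5))          # robust estimates for d = 1.5
```

The result depends on the starting point and on the value of `d`, since `d` decides which residuals receive zero weight; this is the dependence discussed in the continued example 1.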


Continued example 1. Since D.I.Mendeleev data array from table 1 contains outliers, we adduce the results of the quantitative processing of these data by the alternative methods defined above.

1. Let the Andrews function be applied in (4): $\varphi(r) = d\,(1 - \cos(r/d))$ if $|r| \le d\pi$ and $\psi(r) = 0$ if $|r| > d\pi$. Figure 2 shows clearly that in this case the values of the linear model parameters $a_0$ and $a_1$ depend on

a) the value of the parameter $d$ of the Andrews function;

b) the type of the robust regression minimisation problem (the solutions of the first and second minimisation problems of (4) are marked by triangles and circles respectively in figure 2).

Thus, in this case a computative paradox shows itself in the fact that

in actual practice the robust estimates are not robust

and so, as N.R.Draper and H.Smith already wrote,

“unreasoning application of robust estimators looks like reckless application of ridge-estimators: they can be useful, but can be improper also. The main problem is that we do not know which robust estimators, and under which types of assumptions about the errors, are effective to apply; but some investigations in this direction have been done...”


Figure 2. Dependence of the parameter values of the linear model $a_0 + a_1 x$ on the value of the internal parameter of the robust Andrews estimator and on the type of the minimisation problem (4).
2. Let us reveal the outliers in D.I.Mendeleev data array by both combined statistical procedures mentioned above.

Our computation results show that

a) neither procedure could find all four outliers (1, 7, 8 and 9), but only three of them, with numbers 1, 8 and 9;

b) if the readings with numbers 1, 8 and 9 are deleted from D.I.Mendeleev data array, then for the truncated data array the first procedure does not find a new outlier, but the second procedure finds two more outliers, which have numbers 2 and 6 in the initial data array.

Thus, in this case the main computative paradox is exhibited in the fact that

the solutions of the problem of revealing outliers depend on the type of statistical procedure used.

It remains for us to add that, if one

a) computes $y'$ by the formula


$y'(\xi, x) = g_\alpha\big(68.12 - 0.02\,\xi + (0.85652 + 0.00046\,\xi)\, x\big),$ (5)


then for each $n$ the difference $|y_n - y_n'|$ keeps within the limit of the value chosen above for the confidence interval $\delta(y - y')$, where $\alpha = 0.94$; $0 \le \xi \le 35$; $g_\alpha(y) = 2\alpha\,[y/(2\alpha)] + 2\alpha$ if $|y - 2\alpha\,[y/(2\alpha)]| \ge \alpha$, otherwise $g_\alpha(y) = 2\alpha\,[y/(2\alpha)]$; $[b]$ means the integer part of $b$. Thus, another computative paradox occurs:

although for each contaminated data array a family of analytical solutions exists, only a single solution of the estimation problems is found in modern regression analysis theory;

b) puts the extremal values of $\xi$ mentioned above into (5), one is able to determine the exact limits of variation of the linear model parameters $a_0$ and $a_1$: $a_0 = 67.77 \pm 0.35$ and $a_1 = 0.865 \pm 0.008$;

c) deletes the readings with numbers 1, 7, 8 and 9 from D.I.Mendeleev data array, one obtains that in the truncated data array $\{y_n, x_n\}^*$ the difference $|y_n - y_n'|$ for each $n$ keeps within the limit of the error $\varepsilon$, where $\varepsilon$ is the measuring error of the readings $\{y_n\}^*$: $\varepsilon = 0.1$. Since in this case $\delta(y - y') < \varepsilon$, the complete family of analytical solutions has the form


$y'(\xi, x) = g_\alpha\big(67.566 - 0.002\,\xi + (0.870047 + 0.000097\,\xi)\, x\big),$ (6)


where $\alpha = 0.07$; $0 \le \xi \le 45$ and, consequently, $a_0 = 67.521 \pm 0.045$ and $a_1 = 0.872 \pm 0.002$;

d) compares solutions (5) and (6) with the standard LS-solution, one can conclude that the LS-estimates of the parameters $a_0$ and $a_1$ $\{A' = (67.5 \pm 0.5;\ 0.87 \pm 0.2)\}$ are nearly equal to the mean values of these parameters in the general analytical solutions (5) and (6). However,

the values of the variances $\delta a_0'$ and $\delta a_1'$ computed by the standard method disagree with the exact values determined by (5).
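The function $g_\alpha$ used in (5) and (6) rounds its argument onto a grid of step $2\alpha$. A minimal sketch is given below; it assumes that the rounding-up condition is $|y - 2\alpha[y/(2\alpha)]| \ge \alpha$ and that the integer part is taken as the floor (the two coincide for the positive values occurring here). The name `g_alpha` is hypothetical.

```python
import math

def g_alpha(y, alpha):
    """Round y to a multiple of 2*alpha (assumed reading of g_alpha)."""
    step = 2.0 * alpha
    base = step * math.floor(y / step)    # 2a[y/(2a)], [.] taken as floor
    return base + step if abs(y - base) >= alpha else base

# With alpha = 0.94 the grid step is 1.88.
print(g_alpha(68.0, 0.94))
print(g_alpha(69.0, 0.94))
```

By construction $|g_\alpha(y) - y| \le \alpha$, so any two lines whose values differ by less than the grid resolution can be mapped onto the same rounded value; this is how a whole family of parameter values, indexed by $\xi$, can agree with the same readings.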
2.3 The main paradox of regression analysis theory


As emerges from the analysis of the information presented in Sect. 2.2, the main paradox of modern regression analysis theory is exhibited in a contradiction between the statements of this theory, which guarantee the uniqueness of the solution of data analysis problems, and the multiplicity of solutions in actual practice. In this section we adduce several more computative manifestations of this paradox.


Example 2. A two-factor simulated data array is presented in table 2.


Table 2. 
Simulative data array 


Let the approximative model have the form

$y = (a_0 + a_1 x + a_2 x^2) / (1 + a_3 x + a_4 x^2).$ (7)

To find the estimates of the vector parameter $A$ of the model (7) on the data array from table 2 we use two different estimation methods. As the first method we choose the estimation method implemented in the software CURVE-2.0, designed by AISN. In this case we obtain


$A' = \{0.81;\ 0.008;\ -0.31;\ -0.22;\ -0.24\}.$


As the second estimation method we select the Marquardt method. Using the value $A'$ found above by the first estimation method as the initial value of $A$, we obtain in the second case


$A' = \{0.81;\ 0.55;\ 0.035;\ 0.45;\ 0.34\}.$


Thus,

the values of $A'$ obtained by two different estimation methods differ from each other.


Example 3. One more two-factor data array is presented in table 3. Let us select the model $y = a_1 x + e$ as the approximative one and assume that $y$ is a random variable with the known density function $p$:


$p = \exp\{-(y - a_1 x)^2 / (2 f(a_2))\} / \sqrt{2\pi f(a_2)},$ (8)


where f(a.)= a) a2; b) ax? 5; C) aox. 
We will find the estimates of the parameters $a_1$ and $a_2$ by the method of maximum likelihood:


$L = \prod_{n=1}^{N} \exp\{-(y_n - a_1 x_n)^2 / (2 f(a_2))\} / \sqrt{2\pi f(a_2)} \Rightarrow \max$ (9)


or $\partial \ln L / \partial a_i = 0$, where the symbol “ln” means the natural logarithm.
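For two of the variance hypotheses the conditions $\partial \ln L / \partial a_i = 0$ have closed-form solutions, which the sketch below evaluates. It is an illustration under stated assumptions: case (a) is read as $f = a_2$ (constant variance) and case (c) as $f = a_2 x$ with all $x_n > 0$; case (b) is left out. The function name `ml_fit` and the data are invented, and the data are not those of table 3.

```python
import math

def ml_fit(x, y, case):
    """Maximum of the likelihood (9) for f = a2 ('a') or f = a2*x ('c')."""
    n = len(x)
    if case == "a":
        # d ln L / d a1 = 0  =>  a1 = sum(x*y) / sum(x*x)
        a1 = sum(u * v for u, v in zip(x, y)) / sum(u * u for u in x)
        f_unit = [1.0] * n
    elif case == "c":
        # d ln L / d a1 = 0  =>  a1 = sum(y) / sum(x)
        a1 = sum(y) / sum(x)
        f_unit = list(x)
    else:
        raise ValueError("only cases 'a' and 'c' are sketched here")
    # d ln L / d a2 = 0  =>  a2 = mean of (y - a1*x)^2 / f_unit
    a2 = sum((v - a1 * u) ** 2 / f for u, v, f in zip(x, y, f_unit)) / n
    log_l = -sum((v - a1 * u) ** 2 / (2.0 * a2 * f)
                 + 0.5 * math.log(2.0 * math.pi * a2 * f)
                 for u, v, f in zip(x, y, f_unit))
    return a1, a2, log_l

# Invented readings scattered around y = 5 x.
x = [1.0, 2.0, 3.0, 4.0]
y = [5.1, 9.8, 15.2, 19.9]
for case in ("a", "c"):
    print(case, ml_fit(x, y, case))
```

The estimate of $a_1$ changes with the hypothesis because each hypothesis weights the readings differently, while the residuals $y_n - a_1' x_n$ may remain almost the same.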


Table 3. 
Two-factors data array 


Figure 3. Dependences $\{y_n - a_1' x_n\}$ for the different hypotheses about the law of the variance of the random variable y (panels a, b and c).


Computation results of V.I.Mudrov and V.P.Kushko’ show, that in the 
discussed case the estimates values of parameters a; depend on hypothesis about 
Law for the random variable y variance: for case (a) in (8) a;’= 4.938 (L’= 
4.995-10-"* ); for case (b) a;’= 4.896 (L’ =4.421-107'°) and for case (c) a’ = 4.927 
(L’ = 9.217-10-*). By analysing obtained results the authors * conclude, that, since 
likelihood function (9) has its maximum value for case (a), the most likely hypothesis about the law of the variance of the random variable y is hypothesis (a): the variance of y is a constant value.

We demonstrate in figure 3 that for cases (a), (b) and (c) the dependences $\Delta y = \{e_n\} = \{y_n - a_1' x_n\}$ have practically the same form and, consequently,

the strong distinction between the values of L' for the mentioned cases has no firm ground.

It should be noted that 

— the very apparent expression of the discussed main computative paradox of 
regression analysis theory may also be found in books where, for the problem of finding the best linear multiple model fitting the Hald data array, a set of solutions found by various procedures and statistical tests of modern regression analysis theory is adduced;

— the most impressive formulation of the main paradox of regression analysis 
theory is contained in Y.P.Adler introduction ™*: 


“When the computation had arisen, the development of regression analysis 
algorithms went directly «up the stairs, being a descending road». Computer was 
improving and simultaneously new more advanced algorithms were yielded: whole 
regression method, step-by-step procedure, stepped method, etc., — it is impossible 
to name all methods. But again and again it appeared that all these tricks did not 
allow to obtain a correct solution. At least it became clear that in majority cases the 
regression problems belonged to a type of incorrect stated problems. Therefore 
either they can be regularised by exogenous information, or one must put up with 
ambiguous, multivarious solutions. So the regression analysis degraded ingloriously 
to the level of a heuristic method, in which the residual analysis and common sense of interpreter play the leading role. Automation of regression analysis problems came to a dead-lock”.


3 Data analysis problems at unknown theoretical models 


Let us assume that a researcher is to carry out a quantitative analysis of a data array $\{X_n\}$ in the absence of theoretical models. Further consideration will be based on the fact that the described situation demands the solution of the following problems:

— verification of the presence (or absence) of interconnections between analysed 
properties or phenomena; 

— determining (in the case when the interconnection is obvious, a priori and 
logically plausible) in what force this interconnection is exhibited in comparison 
with other factors affecting the discussed phenomena; 

— drawing a conclusion about the presence of a reliable difference between the 
selected groups of analysed objects; 

— revealing object's characteristics irrelevant to analysed property or 
phenomenon; 

— constructing a regression model describing interconnections between analysed 
properties or phenomena. 

In the following sections we consider some methods that allow the foregoing problems to be solved.


3.1 Correlation analysis 


When one is to carry out a quantitative analysis of the data array $\{X_n\}$ in the absence of theoretical models, it is usual to apply correlation analysis at the early stage of the investigation, which allows the structure and strength of the connections between the analysed variables to be determined.

Let, for instance, each n-th state of the object in an experiment be characterised by a pair of its parameters y and x. If the relationship between y and x is unknown, it is sometimes possible to establish the existence and nature of their connection in a simple graphical way. Indeed, it is sufficient to construct a plot of the dependence $\{y_n, x_n\}$ in rectangular coordinates y - x. In this case the plotted points determine a certain correlation field, demonstrating the dependences x = x(y) and/or y = y(x) in a visual form.

To characterise the connection between y and x quantitatively one may use the correlation coefficient $R_{yx}$, determined by the equation


$R_{yx} = \dfrac{\sum_{n=1}^{N} (y_n - \bar{y})(x_n - \bar{x})}{\sqrt{\sum_{n=1}^{N} (y_n - \bar{y})^2 \, \sum_{n=1}^{N} (x_n - \bar{x})^2}},$ (10)


where $\bar{y}$ and $\bar{x}$ are the mean values of the parameters y and x computed on all N readings of the array $\{y_n, x_n\}$. It can be demonstrated that the absolute value of $R_{yx}$ does not exceed unity: $-1 \le R_{yx} \le 1$.

If the variables y and x are connected by a strict linear dependence $y = a_0 + a_1 x$, then $R_{yx} = \pm 1$, where the sign of $R_{yx}$ is the same as that of the parameter $a_1$. This follows, for instance, from the fact that, using $R_{yx}$, one can rewrite the equation of the regression line in the following form:


$y = \bar{y} + R_{yx}\,(S_y / S_x)\,(x - \bar{x}),$ (11)


where $S_y$ and $S_x$ are the mean-square deviations of the variables y and x respectively.

In the general case, when $-1 < R_{yx} < 1$, the points $\{y_n, x_n\}$ tend to approach the line (11) more closely as the value of $|R_{yx}|$ increases. Thus, the correlation coefficient (10) characterises a linear dependence between y and x rather than an arbitrary one. To illustrate this statement we present in table 4 the values $R_{yx} = R_{yx}(\alpha)$ for the functional dependence $y = x^\alpha$, determined on the x-interval [0.5; 5.5] at 11 uniformly spaced points.
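The behaviour of (10) for a strict but non-linear dependence can be checked with a few lines of Python. The sketch below evaluates $R_{yx}$ for $y = x^\alpha$ at 11 uniformly spaced points on [0.5; 5.5]; the list of exponents is an illustrative assumption and is not necessarily the one used in table 4, and the helper name `corr` is hypothetical.

```python
import math

def corr(x, y):
    """Correlation coefficient (10)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)

xs = [0.5 + 0.5 * k for k in range(11)]          # 0.5, 1.0, ..., 5.5
for alpha in (-2.0, -1.0, 0.5, 1.0, 2.0, 3.0):   # illustrative exponents
    ys = [u ** alpha for u in xs]
    print(f"alpha = {alpha:5.1f}   R_yx = {corr(xs, ys):+.3f}")
```

Only $\alpha = 1$ gives $|R_{yx}| = 1$, although y is an exact function of x for every $\alpha$.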


Table 4.


The values $R_{yx} = R_{yx}(\alpha)$ for the functional dependence $y = x^\alpha$, determined on the interval [0.5; 5.5]
Let us clear up the question of what influence the presence of outliers in the data array $\{y_n, x_n\}$ has on the value of the correlation coefficient (10). To do so, let us analyse the data array


$\{x_n\} = (-4;\ -3;\ -2;\ -1;\ 0;\ 10),$ (12)
$\{y_n\} = (2.48;\ 0.73;\ -0.04;\ -1.44;\ -1.32;\ 0),$


where, by the simulation conditions, the reading with number 6 is an extremal outlier (a reading that contrasts sharply with the others); the approximative function is $F(A, x) = a_0 + a_1 x$ and $A_{true} = (-2;\ -1)$.

By computing the values of $R_{yx}$ from (10) and $s$ from (1), we determine the number i of the reading whose elimination from this data array leads to the maximum absolute value of $R_{yx}$ and, consequently, to the minimum value of $s$ (the simplest combinatoric-parametric procedure $P_s$, which allows one outlier to be found in a data array). Our calculations show that the desired value is $R_{yx} = -0.979$ and i = 1. We note that, if the extremal outlier $y_6$ is removed from the array (12), then $R_{yx} = -0.960$, but $s$ of (1) takes its minimum value. The presented results enable us to state that

the procedure $P_s$ loses its effectiveness when the outlier is revealed not by the test $s$, but by the test $R_{yx}$.

Let us consider another case. For the array (12) the noise array is $\{e_n\} = \{-2 - x_n - y_n\} = (-0.48;\ 0.27;\ 0.04;\ 0.44;\ -0.68;\ -12.0)$. We reduce by half the first 5 magnitudes of the noise array $\{e_n\}$: $\{e_n\}_{new} = (-0.24;\ 0.14;\ 0.02;\ 0.22;\ -0.34;\ -12.0)$; form a new array $\{y_n\}_{new} = \{-2 - x_n - (e_n)_{new}\}$ and determine again the number i of the reading whose elimination from the data array $\{y_n, x_n\}_{new}$ leads to the maximum absolute value of $R_{yx}$. In the described case $R_{yx}$ reaches its maximum absolute value when the reading 6 (the extremal outlier) is deleted from the array $\{y_n, x_n\}_{new}$ ($R_{yx} = -0.989$). If we eliminate from the array $\{y_n, x_n\}_{new}$ the reading 6, identified correctly by the test “the maximum absolute value of $R_{yx}$”, then by this test we are able to identify correctly the next outlier (the reading 5) in the discussed array. Thus, we finally obtain:

when the dependence between the analysed variables is to a certain extent close to a linear one, one may use the correlation coefficient (10) for revealing outliers present in data arrays.

It is known that, when the number of analysed variables K > 2, the structure and strength of the connections between the variables $x_1, x_2, \dots, x_K$ are determined by computing all possible pairwise correlation coefficients $R_{x_i x_j}$ from (10). In this case all coefficients $R_{x_i x_j}$ are usually presented in the form of a square symmetric K by K matrix


$R = \begin{pmatrix} R_{11} & R_{12} & \dots & R_{1K} \\ R_{21} & R_{22} & \dots & R_{2K} \\ \dots & \dots & \dots & \dots \\ R_{K1} & R_{K2} & \dots & R_{KK} \end{pmatrix},$ (13)


which is called a correlation matrix (we note that the diagonal elements of this matrix are $R_{ii} = 1$). Finding strongly interconnected pairs of the variables $x_1, x_2, \dots, x_K$ from the magnitudes of the coefficients $R_{ij}$ of the matrix R is the traditional use of the matrix (13) in data analysis. But, obviously,

when using this approach, one should bear in mind all the ideas outlined above concerning the correlation coefficient (10).


3.2 Discriminant analysis 


Let a certain object W be characterised by the value of its vector parameter $X_W = (x_1, x_2, \dots, x_K)$; let $W_1, W_2, \dots, W_p$ be p classes, and let the object W have to be assigned to a class $W_i$ from the value of its vector parameter $X_W$. In discriminant analysis the formulated problem is the main one.

The accepted technique for solving the mentioned problem entails the construction of a discriminant function D(A, X). The form of this function and the values of its coefficients $\{a_i\}$ are determined from the requirement that the values of D(A, X) must have maximum dissimilarity if the parameters of objects belonging to the different populations $W_1, W_2, \dots, W_p$ are used as arguments of this function.

It seems obvious that in the general case, firstly, D(A, X) may be either a linear or a non-linear function of $\{a_i\}$ and, secondly, there must be some connection between the problem-solving techniques of discriminant and regression analyses. In particular, as has been stated, for solving problems of discriminant analysis one may use standard algorithms and programs of regression analysis. Thus, the similarity of the techniques used for solving problems of regression and discriminant analyses makes it possible to apply the alternative algorithms and procedures of regression analysis in discriminant analysis and, consequently,

if data analysis problems are solved by discriminant analysis techniques, then in practice the researcher may meet the same difficulties that are discussed in Sect. 2.


3.3 Regression analysis 


In the absence of theoretical models it is usual to employ regression analysis in order to express in mathematical form the connections existing between the variables under analysis.

It happens very frequently that researchers impose limitations on the type and form of approximative models or, in other words, approximative models are often chosen from a given set. Evidently, in this case it is required to solve the problem of finding the best approximative model from a given set of models. Let, for instance, it be required to find the best approximative multinomial with a minimal degree. With this in mind, in the two examples below we consider some accepted techniques used for solving the mentioned problem in approximation and/or regression analysis theories.


Example 4. Let

$\{x_n\} = \{-1 + 0.2\,(n - 1)\}; \quad f(x) = \sin x; \quad \{y_n\} = \{[k f(x_n)] / k\},$ (14)


where square brackets mean the integer part; n = 1, 2, ..., 11; f(x) is a given function used for generating the array $\{y_n\}$; the factor $k = 10^3$, and its presence in (14) is necessary for “measuring” all values of $y_n$ within the error $\varepsilon = 10^{-3}$. It is required to find, for the presented dependence $\{y_n, x_n\}$, the best approximative multinomial with a minimal degree.

A. As is well known in approximation theory, if the type of the function f(x) is given and either the (M+1)-th derivative of this function varies weakly on the realisation $\{x_n\}$, or the function f(x) is represented on the x-interval [-1, 1] by a uniformly converging power series, then the problem of finding the best uniformly approximating multinomial for the discrete dependence $\{y_n, x_n\}$ offers no difficulty. Indeed, in the first case the solution of the problem is an interpolating multinomial $P_M(B, x)$ on the set of Chebyshev points (this multinomial is close to the best uniformly approximating one). In the second case one may obtain the solution by the following economisation procedure for a uniformly converging power series:


1. Choose the initial part of the truncated Taylor series, approximating the function f(x) within the error $\varepsilon_M < \varepsilon$, as the multinomial $P_M(B, x)$ (the multinomial with the degree M and the vector parameter B);

2. Replace $\varepsilon_M$ with $\varepsilon_M - |b_M| / 2^{M-1}$, where $b_M$ is the coefficient of the multinomial $P_M(B, x)$ at $x^M$;

3. If $\varepsilon_M > 0$, then replace the multinomial $P_M(B, x)$ with the multinomial


$P_{M-1}(x) = P_M(x) - (b_M / 2^{M-1})\, T_M(x),$ (15)

where $T_M(x)$ is the Chebyshev multinomial: $T_0 = 1$, $T_1 = x$ and, when $M \ge 2$, $T_M = 2x\,T_{M-1} - T_{M-2}$. Then decrement M by one and go to point 2. If $\varepsilon_M < 0$, then go to point 4;

4. End of computations: the multinomial $P_M(B, x)$ is the desired one.


By means of the foregoing economisation procedure one may easily obtain that the multinomial with a minimal degree, uniformly approximating the function sin x given within the error $\varepsilon = 10^{-3}$ on the x-interval [-1, 1], has the following form:


$P_3(x) = (383/384)\, x - (5/32)\, x^3.$ (16)
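The result (16) can be reproduced in exact rational arithmetic. The sketch below is one possible reading of the economisation procedure, not necessarily the exact bookkeeping of points 1-4: it starts from the Taylor polynomial $x - x^3/6 + x^5/120$, takes $1/5040$ as the bound of its truncation error on [-1, 1], and keeps an error margin that is reduced by $|b_M| / 2^{M-1}$ each time the degree is lowered by (15). The names `chebyshev` and `economise` are hypothetical.

```python
from fractions import Fraction

def chebyshev(m):
    """Coefficients of T_m (lowest power first): T_m = 2x T_{m-1} - T_{m-2}."""
    t_prev, t_cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if m == 0:
        return t_prev
    for _ in range(m - 1):
        shifted = [Fraction(0)] + [2 * c for c in t_cur]      # 2x * T_{m-1}
        padded = t_prev + [Fraction(0)] * (len(shifted) - len(t_prev))
        t_prev, t_cur = t_cur, [a - b for a, b in zip(shifted, padded)]
    return t_cur

def economise(coeffs, margin):
    """Apply (15) while the added error |b_M| / 2^(M-1) fits into margin."""
    p = list(coeffs)
    while len(p) > 2:
        m = len(p) - 1
        cost = abs(p[m]) / 2 ** (m - 1)
        if cost > margin:
            break
        t = chebyshev(m)
        p = [a - (p[m] / 2 ** (m - 1)) * b for a, b in zip(p, t)][:m]
        margin -= cost
    return p

# Taylor part of sin x up to x^5; truncation error on [-1, 1] is below 1/5040.
taylor = [Fraction(0), Fraction(1), Fraction(0),
          Fraction(-1, 6), Fraction(0), Fraction(1, 120)]
p3 = economise(taylor, Fraction(1, 1000) - Fraction(1, 5040))
print(p3[1], p3[3])    # 383/384 -5/32
```

Removing the $x^5$ term costs $1/1920$, which fits into the margin, whereas removing the $x^3$ term would cost $5/128$, so the procedure stops at the third degree.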


B. Let $1 \le M \le 9$ and let all $\lambda_m = 1$ in the multinomial $P_M(B, x) = \sum_{m=0}^{M} \lambda_m b_m x^m$. By determining the LS-estimates of the vector parameter B for each value of M on the array $\{y_n, x_n\}$ formed above, we find that in all the obtained approximative multinomials $P_M(B', x)$, as well as in the Taylor series of the function sin x, the values of the coefficients $b'_{2l} = 0$ at l = 0, 1, ..., 4. Thus, regression analysis of the discussed array $\{y_n, x_n\}$ allows the form of the best approximative multinomial to be determined:


$P_{2l+1,\,opt}(B, x) = b_1 x + b_3 x^3 + \dots + b_{2l+1} x^{2l+1}.$ (17)


We note that, if l = 1, then the values of the parameters $b_1$ and $b_3$ computed by the regression analysis method (LS-method) for the model (17) coincide with those shown in (16).

We recall that in the classical variant of regression analysis theory the best approximative multinomial is chosen by the minimal value of the test $s_l = S_l / (N - l)$, where $S_l$ is the residual sum of squares, $l = \sum_m \lambda_m$; N is the total number of readings; $\lambda_m$ is a characteristic number such that $\lambda_m = 0$ if the approximative multinomial $P_M(B, x)$ does not contain the term $b_m x^m$, and $\lambda_m = 1$ otherwise. For the approximative models (17) and l = 0, 1, 2, 3 the computed values of s are the following:


l    0        1          2          3
s    0.033    0.00063    0.00043    0.00054


Since s has its minimal value at l = 2, for the discussed array, in the framework of the classical variant of regression analysis theory, the multinomial


$P_5(x) = b_1 x + b_3 x^3 + b_5 x^5$ (18)

is the best approximative one.


C. Since, for each n in the data array (14), the difference $|y_n - y_n'|$ must be kept within the limit of the error $\varepsilon = 10^{-3}$, the general solution of the discussed problem has the following form:


$y'(\xi, x) = g_\alpha\{(1.0012 - 0.0001\,\xi)\, x - (0.161200 - 0.000127\,\xi)\, x^3\},$ (19)


where $\alpha = 0.001$; $0 \le \xi \le 49$ and, consequently, $b_1 = 0.9987 \pm 0.0025$ and $b_3 = -0.1582 \pm 0.0030$.

By analysing solutions (16), (18) and (19) we conclude that in the considered 
case 

the solution of the problem on finding the best fitting multinomial depends on 
the type of the used mathematical theory. 


Example 5. In some software products (for instance, in the different versions of the software CURVE, designed by AISN) the solutions of problems of finding the best approximative models are selected by the magnitude of a determination coefficient, whose value may be computed by the set of formulae


$R_D = \sqrt{1 - Q_1 / Q}, \quad Q_1 = \sum_n (y_n - y_n')^2, \quad Q = \sum_n (y_n - \bar{y})^2,$ (20)

where $\bar{y} = \sum_{n=1}^{N} y_n / N$, $y_n$ is the n-th reading of the dependent variable, $y_n'$ is the n-th value of the dependent variable computed from the fitting model; or by the formula (first proposed by K.Pearson)


$R_P = \dfrac{\sum_n (y_n - \bar{y})(y_n' - \bar{y}')}{\sqrt{\sum_n (y_n - \bar{y})^2 \, \sum_n (y_n' - \bar{y}')^2}}$ (21)


(evidently, one may easily obtain formula (21) from formula (10)).

Table 5. 


The simulative data array to example 5 


3.8136396 
3.5774037 
3.4193292 
3.2903451 
3.1026802 


There is a mathematical proof of the equivalence of formulae (20) and (21). But, if the value of the determination coefficient is computed within an error > $10^{-8}$,

in actual practice, for some data arrays, firstly, the values given by (20) and (21) differ and, secondly, the coefficient exceeds 1.

For instance, if one fits the simulative data array presented in table 5 by the multinomial $P_M(B, x)$ with M = 8, then the software CURVE-2.0 gives the value 1.00040 for the determination coefficient.
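The equivalence of (20) and (21) for an LS fit can be checked numerically. The sketch below assumes that (21) is the correlation coefficient (10) between the readings $y_n$ and the fitted values $y_n'$, uses a straight-line LS fit with an intercept, and works on invented data; it says nothing about the internal algorithm of CURVE-2.0. The function names are hypothetical.

```python
import math

def fit_line(x, y):
    """LS estimates (a0, a1) of y = a0 + a1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a1 = (sum((u - mx) * (v - my) for u, v in zip(x, y))
          / sum((u - mx) ** 2 for u in x))
    return my - a1 * mx, a1

def r_residual(y, y_fit):
    """Formula (20): sqrt(1 - sum (y - y')^2 / sum (y - mean y)^2)."""
    my = sum(y) / len(y)
    q_res = sum((v - f) ** 2 for v, f in zip(y, y_fit))
    q_tot = sum((v - my) ** 2 for v in y)
    return math.sqrt(1.0 - q_res / q_tot)

def r_pearson(y, y_fit):
    """Formula (21), read as the correlation (10) between y and y'."""
    n = len(y)
    my, mf = sum(y) / n, sum(y_fit) / n
    num = sum((v - my) * (f - mf) for v, f in zip(y, y_fit))
    den = math.sqrt(sum((v - my) ** 2 for v in y)
                    * sum((f - mf) ** 2 for f in y_fit))
    return num / den

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]          # invented readings
a0, a1 = fit_line(x, y)
y_fit = [a0 + a1 * u for u in x]
print(r_residual(y, y_fit), r_pearson(y, y_fit))
```

For a well-conditioned fit such as this one the two values agree to within rounding; the equality relies on the fitted values being an exact LS solution, so it can fail when a high-degree multinomial is fitted with insufficient numerical accuracy.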


4 Problems of quantitative processing of experimental dependences found for heterogeneous objects


As follows from the general consideration, in practice, in the analysis of experimental dependences found for heterogeneous objects, three different situations can be realised: the heterogeneity of the investigated objects causes a) no effect; b) a removable (local) inadequacy of the postulated fitting model; c) an irremovable (global) inadequacy of the postulated model. In this section we discuss some computative difficulties which may occur in the analysis of the mentioned experimental dependences.


Example 6. As we know from Sect. 2.1, if F(A, x) is a linear model $\{F(A, x) = \sum_{l=1}^{L} a_l h_l(x)\}$, then the value of $A'$ minimising the residual sum of squares S is computed by (1). Let rank H < L or, in other words, let there be a linear dependence between the columns of the matrix H:


$c_1 h_1 + c_2 h_2 + \dots + c_L h_L = 0,$ (22)


where at least one coefficient $c_l \ne 0$. In this case the matrix $(H^T H)^{-1}$ does not exist, which means that one cannot find $A'$ from (1). Such a situation is known as strict multicollinearity.
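Strict multicollinearity is easy to exhibit for a matrix H with two columns, for which the existence of $(H^T H)^{-1}$ is decided by one determinant. The sketch below uses exact rational arithmetic and invented columns; the perturbation of $10^{-6}$ in one entry is a hypothetical round-off error, and the name `gram_det` is an assumption of the sketch.

```python
from fractions import Fraction

def gram_det(h1, h2):
    """Determinant of H^T H for the two-column matrix H = (h1, h2)."""
    s11 = sum(a * a for a in h1)
    s22 = sum(b * b for b in h2)
    s12 = sum(a * b for a, b in zip(h1, h2))
    return s11 * s22 - s12 * s12

x = [Fraction(k) for k in range(1, 6)]
h1 = x
h2 = [2 * v for v in x]                  # 2*h1 - h2 = 0, i.e. (22) holds
print(gram_det(h1, h2))                  # 0: (H^T H)^{-1} does not exist

# A hypothetical round-off error in one entry removes the exact dependence.
h2_rounded = list(h2)
h2_rounded[0] += Fraction(1, 10**6)
print(float(gram_det(h1, h2_rounded)))   # tiny but non-zero
```

After the perturbation the determinant is non-zero, so (1) can formally be applied, but dividing by such a small number strongly amplifies any error in $H^T Y$.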

In the natural science investigations values of the independent variable X are 
always determined with a certain round-off error, although this error may be very 
small. Therefore, if even strict multicollinearity is present, in practice the equation 
(22) is satisfied only approximately and therefore rank H = L. In such situation 
application of equation (1) to find the estimate of vector parameter A gives A’ 
values drastically deviating from true coefficients values va cage 

To correct this situation, regression on characteristic roots [6] suggests 
obtaining information about the conditioning of the matrix $H^T H$ from the values of 
its eigenvalues $\lambda_j$ and of the first elements $V_{0j}$ of its eigenvectors $V_j$, 
and excluding from the regression those $j$-components whose eigenvalues $\lambda_j$ and 
elements $V_{0j}$ are small. The following critical values are recommended: 
$\lambda_{cr} = 0.05$ and $V_{0cr} = 0.1$. 
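
The screening rule alone can be sketched as follows. This is only an illustration 
under our own assumptions: the helper `screen_components` is hypothetical, it 
compares absolute values of the first elements with the critical value, and it is 
applied to a small made-up correlation matrix of the array $(y, h_1, h_2)$ rather 
than to any matrix from this paper. 

```python
import numpy as np

def screen_components(m, lam_cr=0.05, v0_cr=0.1):
    """Flag eigen-components with a small eigenvalue and a small first element."""
    lam, vec = np.linalg.eigh(m)          # eigenvalues in ascending order
    v0 = np.abs(vec[0, :])                # first element of each eigenvector
    excluded = (lam < lam_cr) & (v0 < v0_cr)
    return lam, v0, excluded

# Hypothetical 3x3 correlation matrix of (y, h_1, h_2): the two regressors
# are almost collinear (r = 0.99) and y is uncorrelated with both.
m = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.99],
              [0.0, 0.99, 1.0]])

lam, v0, excluded = screen_components(m)
for j in range(lam.size):
    print(f"j = {j}: lambda = {lam[j]:.3f}, |V_0j| = {v0[j]:.3f}, "
          f"excluded = {bool(excluded[j])}")
```

Only the component with $\lambda = 0.01$, which corresponds to the difference of the 
two nearly collinear regressors, satisfies both conditions and is excluded. 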

Let us demonstrate that 

in some practical computations the difference between $A'_{cur}$ and $A'_{LS}$ of (1) 
can be explained not by the effects of multicollinearity but by regression model 
inadequacy, which disappears simultaneously with the effects of multicollinearity 
after the outliers are removed. 

Indeed, let the data array be the following: 


$$\{y_n, X_n\} = \{1 + 0.5 n + 0.05 n^2 + 0.005 n^3;\; n, n^2, n^3\}, \qquad (23)$$ 


$n = 1, 2, \dots, 11$, and let us introduce two outliers in (23) by increasing the values 
$y_3$ and $y_8$ by 0.5. For this data array we obtain the following computation results: 


$N_{step} = 1$, $N = 11$:  $A'_{LS}$  = (1.017; 0.542; 0.0462; 0.0050), 
$n_a = \{3, 8\}$           $A'_{cur}$ = (1.046; 0.516; 0.0516; 0.0047); 
$N_{step} = 2$, $N = 10$:  $A'_{LS}$  = (0.764; 0.801; 0.0169; 0.0090), 
$n_a = \{3\}$              $A'_{cur}$ = (0.764; 0.801; -0.0169; 0.0090), 


where $N_{step}$ is the number of the step in the computative procedure used; $n_a$ is a 
vector indicating the numbers of the anomalous readings contained in the analysed 
array at the first and second steps of the procedure; $N$ is the total number of 
analysed readings. In particular, after the first step of the computative procedure 
the reading with number 8 is removed from (23); after the second step, the readings 
with numbers 3 and 8. After the second step the values of $A'$ are restored without 
any distortion by both examined algorithms. 

By analysing the obtained computation results one can conclude that 

the difference between $A'_{LS}$ and $A'_{cur}$ may be caused not only by 
multicollinearity but also, for instance, by a set of outliers present in the data 
array. 
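
The least-squares part of this example can be sketched as below. The sketch assumes 
NumPy, uses `numpy.linalg.lstsq` in place of formula (1), and removes the readings in 
the order stated in the text. It reproduces neither the outlier-revealing procedure 
nor the characteristic-roots algorithm, so the printed values need not coincide with 
those quoted above. The helper names `design` and `fit_ls` are ours. 

```python
import numpy as np

def design(n):
    """Rows (1, n, n^2, n^3) of the matrix H for the cubic model in (23)."""
    n = np.asarray(n, dtype=float)
    return np.column_stack([np.ones_like(n), n, n**2, n**3])

def fit_ls(n, y):
    """Least-squares estimate A'; lstsq stands in for formula (1)."""
    a_est, *_ = np.linalg.lstsq(design(n), y, rcond=None)
    return a_est

a_true = np.array([1.0, 0.5, 0.05, 0.005])
n = np.arange(1, 12)
y = design(n) @ a_true
y[[2, 7]] += 0.5                     # outliers: readings number 3 and 8

fits = {}
for removed in ([], [8], [3, 8]):
    keep = np.array([k not in removed for k in n])
    fits[tuple(removed)] = fit_ls(n[keep], y[keep])
    print(f"removed {removed}: A' = {np.round(fits[tuple(removed)], 4)}")
```

Once both contaminated readings are removed, the estimate coincides with the 
coefficients of (23) up to rounding, whereas the fit to all 11 readings is distorted. 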


Example 7. In table 6 we adduce an experimental data array obtained by 
N.P. Bobrysheva, who investigated the dependence of the magnetic susceptibility 
($\chi$, relative units) of the polycrystalline system 
$\mathrm{V}_x\mathrm{Al}_{1-x}\mathrm{O}_{1.5}$ ($x = 0.078$) on the temperature ($T$, K). 
Let us consider some computation results of the quantitative processing of this 
temperature dependence. 


Table 6. 


The experimental dependence of the magnetic susceptibility of the system 
$\mathrm{V}_x\mathrm{Al}_{1-x}\mathrm{O}_{1.5}$ ($x = 0.078$) on temperature 


Figure 4. Experimental (circles) and analytical (continuous curves) plots of the 
dependences $\chi(T)$ (a) and $1/(\chi + \chi_0)$ versus $T$ (b) for the system 
$\mathrm{V}_x\mathrm{Al}_{1-x}\mathrm{O}_{1.5}$ ($x = 0.078$). 


A. In figure 4(a, b) the experimental (circles) and analytical (continuous curves) 
plots of the dependences $\chi(T)$ and $1/(\chi + \chi_0)$ versus $T$ are shown for the 
discussed system. To construct the analytical (continuous) curves we use the 
modified Curie-Weiss law [6, 7, 24, 25] 

$$\chi = \chi_0 + C/(T + \theta), \qquad (24)$$ 


where $\chi$ is the experimental magnitude of the specific magnetic susceptibility; $T$ 
is the absolute temperature, K; $C$, $\theta$ and $\chi_0$ are parameters: $C = 988$; 
$\theta = 14$ K; $\chi_0 = 0.54$. From the graphical information presented in figure 
4(a, b) one can conclude that 

the magnetic behaviour of the system $\mathrm{V}_x\mathrm{Al}_{1-x}\mathrm{O}_{1.5}$ 
($x = 0.078$) is well explained by the modified Curie-Weiss law (24). 


B. In figure 5(a) the dependence of $\Delta\chi = \chi - C/(T + \theta) - \chi_0$ on $T$ 
for the system $\mathrm{V}_x\mathrm{Al}_{1-x}\mathrm{O}_{1.5}$ ($x = 0.078$) is shown. Since 
$\Delta\chi_{max} = 0.2 \gg \varepsilon = 0.01$, where $\varepsilon$ is the measurement 
error in the discussed experiment, we obtain that, 

in contradiction with the statement of point (A), in this case the modified 
Curie-Weiss law (24) is an inadequate approximative model or, in other words, there 
is a set of outliers in the analysed experimental dependence. 
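
The comparison of $\Delta\chi_{max}$ with $\varepsilon$ can be sketched as follows. The 
readings of table 6 are not used: the array below is synthetic, generated from (24) 
with the parameters of point (A), and the distortion added to the first five 
readings is our assumption. The helper names `curie_weiss` and `is_adequate` are ours. 

```python
import numpy as np

def curie_weiss(t, c, theta, chi0):
    """Modified Curie-Weiss law (24)."""
    return chi0 + c / (t + theta)

def is_adequate(t, chi, c, theta, chi0, eps):
    """Return (flag, max |delta chi|); adequate if the residuals stay within eps."""
    d_chi = chi - curie_weiss(t, c, theta, chi0)
    d_max = float(np.max(np.abs(d_chi)))
    return d_max <= eps, d_max

# Synthetic stand-in for table 6 (the real readings are not reproduced here).
t = np.arange(30.0, 631.0, 50.0)
chi_clean = curie_weiss(t, 988.0, 14.0, 0.54)
chi_distorted = chi_clean.copy()
chi_distorted[:5] += np.array([0.2, 0.15, 0.1, 0.05, 0.02])   # assumed distortion

for name, chi in (("clean", chi_clean), ("distorted", chi_distorted)):
    ok, d_max = is_adequate(t, chi, 988.0, 14.0, 0.54, eps=0.01)
    print(f"{name}: max |delta chi| = {d_max:.3f}, adequate = {ok}")
```

On the scale of $\chi(T)$ itself (about 23 at 30 K in this synthetic array) a 
deviation of 0.2 is invisible in a plot, yet it exceeds $\varepsilon$ by a factor of 20. 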


C. After deleting the first 5 readings from the initial data array, the parameters 
of the modified Curie-Weiss law (24) take the values $C = 1386$; $\theta = 89$ K; 
$\chi_0 = 0.14$. The plot of the dependence $\Delta\chi = \chi - C/(T + \theta) - \chi_0$ 
with these parameter values is shown in figure 5(b). 
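
How deleting a few distorted readings changes the estimates of (24) can be sketched 
on synthetic data. Everything here is an assumption made for illustration: the 
array is generated from (24) with the parameters of point (A), the first five 
readings receive an alternating distortion of 0.2, and the fit scans a grid of 
$\theta$ values, solving a linear least-squares problem for $(\chi_0, C)$ at each. This is 
not the estimation procedure used in the paper. 

```python
import numpy as np

def fit_curie_weiss(t, chi, theta_grid):
    """For each trial theta solve a linear LS problem in (chi0, C); keep the best."""
    best = None
    for theta in theta_grid:
        g = np.column_stack([np.ones_like(t), 1.0 / (t + theta)])
        coef, *_ = np.linalg.lstsq(g, chi, rcond=None)
        sse = float(np.sum((chi - g @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, coef[1], theta, coef[0])
    _, c, theta, chi0 = best
    return float(c), float(theta), float(chi0)

# Synthetic stand-in for table 6: law (24) with the parameters of point (A),
# plus an assumed alternating distortion of the first 5 readings.
t = np.arange(30.0, 631.0, 20.0)
chi = 0.54 + 988.0 / (t + 14.0)
chi[:5] += 0.2 * np.array([1.0, -1.0, 1.0, -1.0, 1.0])

theta_grid = np.arange(0.0, 200.5, 0.5)
fit_all = fit_curie_weiss(t, chi, theta_grid)
fit_trim = fit_curie_weiss(t[5:], chi[5:], theta_grid)

for name, (c, theta, chi0) in (("all readings", fit_all),
                               ("first 5 deleted", fit_trim)):
    d_max = np.max(np.abs(chi - chi0 - c / (t + theta)))
    print(f"{name}: C = {c:.1f}, theta = {theta:.1f}, chi0 = {chi0:.3f}, "
          f"max |delta chi| = {d_max:.3f}")
```

In this synthetic case the fit to the trimmed array returns the generating 
parameters, while the fit to all readings cannot bring every residual below 
$\varepsilon = 0.01$. 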


Figure 5 (a, b). Plots of $\Delta\chi = \chi - C/(T + \theta) - \chi_0$ versus $T$ for the 
system $\mathrm{V}_x\mathrm{Al}_{1-x}\mathrm{O}_{1.5}$ ($x = 0.078$). 


Analysing the plots $\Delta\chi(T)$ presented in figure 5(a, b) and comparing the 
parameter values of equation (24) mentioned in points (A) and (C), we conclude that 

neglect of the local inadequacy of the approximative model in the discussed 
experiment leads to distortion of both the form of the function $\Delta\chi(T)$ and the 
parameter values of the modified Curie-Weiss law (24). 

Thus, if researchers, in order to prove the good fit of equation (24), suggest 
looking at the graphic representation of the dependences $\chi(T)$ or $1/\chi(T)$, then 
for completeness of the proof one should ask these researchers to present 
information about the measurement error of the values $\chi$ and the plots 
$\Delta\chi(T) = \chi - C/(T + \theta) - \chi_0$. 

For the sake of convenience, the main causes giving rise to computative 
difficulties in the analysis of experimental dependences found for heterogeneous 
objects, and the methods of overcoming them, are adduced together in table 7. In this 
table the methods of overcoming computative difficulties are marked by the symbol 
© if at present they are only roughly developed or absent in modern data analysis 
theory. 


Table 7. 
Main causes and overcoming methods of computative difficulties in 
modern data analysis theory 

No. | Main causes | Methods of overcoming 
1. | Impossibility to take heed of preset measurement accuracy of dependent variable values in the frame of accepted data analysis model | © Modification of data analysis model 
2. | Limited accuracy of computations | Increasing computation accuracy 
3. | Point estimation of parameters | Replacing the point estimation of parameters by interval one 
4. | The deficient measurement accuracy of dependent variable values | Increasing measurement accuracy of dependent variable values 
5. | Ill-conditioning of estimation problem | © Using alternative estimation methods; Increasing measurement accuracy of dependent variable values; © Revealing and removing outliers; Designing experiments 
6. | Presence of outliers in analysing data arrays | © Revealing and removing outliers; © Robust estimation of parameters 
7. | Inadequacy of approximative model | © Eliminating inadequacy of approximative model; © Using advanced estimation methods 
8. | Finding only single solution of the estimation problems for contaminated data array in the frame of modern data analysis theories | © Finding a family of solutions 


Using the information presented in table 7, let us clarify whether, within modern 
data analysis theory, one is able to obtain reliable solutions for the problems of 
quantitative processing of experimental dependences found for heterogeneous 
objects. 

Let a connection between the characteristics $y$ and $X$ exist when the investigated 
object is homogeneous, and let it be close to a functional one: $y = F(A, X)$. As we 
said at the beginning of this section, in the discussed experiments three different 
situations can be realised: the structural heterogeneity 

1) has no effect on the experimental dependence $\{y_n, X_n\}$ or, in other words, in 
this case it is impossible to distinguish homogeneous objects from heterogeneous 
ones by the dependence $\{y_n, X_n\}$; 

2) leads to a distortion of the dependence $\{y_n, X_n\}$ in some small region 
$\{X_{n_a}\} \subset \{X_n\}$ {the approximative model $F(A, X)$ has a removable (local) 
inadequacy}. In this case, to extract the effects connected with the presence of 
inhomogeneity in the investigated objects, one may proceed in the following way: 
i) solve the problem of revealing the outliers $\{y_{n_a}, X_{n_a}\}$; 
ii) determine the value $A'$ on the readings $\{y_n, X_n\} \setminus \{y_{n_a}, X_{n_a}\}$ 
{we remind that the set $\{y_n, X_n\} \setminus \{y_{n_a}, X_{n_a}\}$ is to be well fitted 
by the model $F(A, X)$}; 
iii) detect the type and degree of the effects connected with the presence of 
inhomogeneity in the investigated objects on the data array 
$\{y_{n_a} - F(A', X_{n_a});\; X_{n_a}\}$. 
It follows from point 6 of table 7 that in solving problem (i) in actual practice 
some difficulties can arise which are insurmountable within modern regression 
analysis theory; 

3) leads to a distortion of the dependence $\{y_n, X_n\}$ in a big region 
$\{X_{n_a}\} \subset \{X_n\}$ {the approximative model $F(A, X)$ has an irremovable (global) 
inadequacy}. 

It follows from point 7 of table 7, that in this case it is impossible to find a 
reliable solution of the discussed problem in the frame of modern data analysis 
theory. 
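
The flow of steps (i)-(iii) in situation (2) can be sketched on the contaminated 
array of Example 6. The rule used for step (i) is a deliberately naive assumption of 
ours: drop the reading with the largest residual while it exceeds a preset 
measurement error `eps`. It works on this clean synthetic case only and does not 
remove the difficulties noted under point 6 of table 7. The helper names `design` 
and `reveal_and_fit` are hypothetical. 

```python
import numpy as np

def design(x):
    """Rows (1, x, x^2, x^3): the cubic model of Example 6 as F(A, X)."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2, x**3])

def reveal_and_fit(x, y, eps):
    """Steps (i)-(iii) with a naive rule: drop the worst reading while it exceeds eps."""
    keep = np.ones(x.size, dtype=bool)
    while True:
        # Step (ii): fit the model on the readings currently kept.
        a_est, *_ = np.linalg.lstsq(design(x[keep]), y[keep], rcond=None)
        resid = y - design(x) @ a_est
        worst = int(np.argmax(np.where(keep, np.abs(resid), -np.inf)))
        if abs(resid[worst]) <= eps or keep.sum() <= a_est.size + 1:
            break
        keep[worst] = False          # step (i): mark the reading as an outlier
    # Step (iii): residuals y - F(A', X) on the revealed outliers.
    return a_est, ~keep, resid[~keep]

x = np.arange(1.0, 12.0)
y = design(x) @ np.array([1.0, 0.5, 0.05, 0.005])
y[[2, 7]] += 0.5                     # readings number 3 and 8 are outliers

a_est, is_outlier, effect = reveal_and_fit(x, y, eps=0.01)
print("outlier readings:", (np.flatnonzero(is_outlier) + 1).tolist())
print("A' =", np.round(a_est, 4))
print("y - F(A', X) on outliers:", np.round(effect, 3))
```

Here the residuals on the revealed readings return the introduced shift of 0.5, 
which plays the role of the "effect" examined in step (iii). 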

Summarising what was said in points (1)-(3), we conclude that 

since at present the methods marked by the symbol © in table 7 are either not 
effective for overcoming computative difficulties or absent in modern data analysis 
theory, one is not able to obtain reliable solutions for the problems of 
quantitative processing of experimental dependences found for heterogeneous objects. 

From our point of view, one possible way of overcoming the computative 
difficulties in modern data analysis theory is the further development of this 
theory by translating it from Aristotelian “binary logic” into the more 
progressive “fuzzy logic” [24, 25, 33]. 